Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings

Authors

  • Ting Liu
  • Kit Cho
  • George Aaron Broadwell
  • Samira Shaikh
  • Tomek Strzalkowski
  • John Lien
  • Sarah M. Taylor
  • Laurie Feldman
  • Boris Yamrom
  • Nick Webb
  • Umit Boz
  • Ignacio Cases
  • Ching-Sheng Lin
Abstract

Recent studies in metaphor extraction across several languages (Broadwell et al., 2013; Strzalkowski et al., 2013) have shown that word imageability ratings are highly correlated with the presence of metaphors in text. Imageability ratings can be obtained from the MRC Psycholinguistic Database (MRCPD) for English words and from the Léxico Informatizado del Español Programa (LEXESP) for Spanish words; both are collections of human ratings obtained in a series of controlled surveys. Unfortunately, ratings were collected for only a limited number of words (9,240 in English and 6,233 in Spanish) and are entirely unavailable in the other two languages studied, Russian and Farsi. The present study describes an automated method for expanding the MRCPD by conferring imageability ratings on the synonyms and hyponyms of existing MRCPD words, as identified in WordNet. The result is an expanded MRCPD+ database with imageability scores for more than 100,000 words. The appropriateness of this expansion process is assessed by examining the structural coherence of the expanded set and by validating the expanded lexicon against human judgment. Finally, the performance of the metaphor extraction system is shown to improve significantly with the expanded database. This paper describes the process for English MRCPD+ and the resulting lexical resource. The process is analogous for other languages.
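The expansion idea in the abstract, passing ratings from rated words to their synonyms and hyponyms, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the seed ratings are made-up numbers, the `synonyms` and `hyponyms` dictionaries are a toy stand-in for WordNet lookups, the function name `expand_ratings` is hypothetical, and averaging the ratings when a word is reachable from several rated words is a choice of this sketch, not a rule stated in the abstract.

```python
from statistics import mean

# Made-up seed ratings standing in for human-rated MRCPD entries.
seed = {"dog": 636.0, "animal": 575.0, "idea": 300.0}

# Toy stand-ins for WordNet synonym and hyponym lookups (hypothetical data).
synonyms = {"dog": ["hound"], "idea": ["thought", "notion"]}
hyponyms = {"dog": ["puppy", "poodle"], "animal": ["dog", "puppy"]}


def expand_ratings(seed, synonyms, hyponyms):
    """Confer each seed word's rating on its unrated synonyms and hyponyms.

    Assumption of this sketch: a word reached from several seed words
    receives the mean of the ratings it inherits. Human-rated seed words
    are never overwritten.
    """
    inherited = {}
    for word, rating in seed.items():
        related_words = synonyms.get(word, []) + hyponyms.get(word, [])
        for related in related_words:
            if related not in seed:
                inherited.setdefault(related, []).append(rating)

    expanded = dict(seed)
    for word, ratings in inherited.items():
        expanded[word] = mean(ratings)
    return expanded


expanded = expand_ratings(seed, synonyms, hyponyms)
```

In this toy run, `puppy` is a hyponym of both `dog` and `animal`, so it receives the average of their two ratings, while `dog` keeps its original human rating even though it is also listed as a hyponym of `animal`.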

Similar resources

Measuring inconsistencies can lead you forward: Imageability and the x-ception theory

According to the traditional view, both imageability and concreteness ratings reflect the way word meanings rely on information mediated by the senses. As a consequence, the two measures should and do correlate. The link between these two indexes was already hypothesized and demonstrated by Paivio et al. (1968) in a seminal article, where they introduced the idea of imageability ratings for the...


Valence, arousal, familiarity, concreteness, and imageability ratings for 292 two-character Chinese nouns in Cantonese speakers in Hong Kong

Words are frequently used as stimuli in cognitive psychology experiments, for example, in recognition memory studies. In these experiments, it is often desirable to control for the words' psycholinguistic properties because differences in such properties across experimental conditions might introduce undesirable confounds. In order to avoid confounds, studies typically check to see if various a...


The Validation of MRCPD Cross-language Expansions on Imageability Ratings

In this article, we present a method to validate a multi-lingual (English, Spanish, Russian, and Farsi) corpus on imageability ratings automatically expanded from MRCPD (Liu et al., 2014). We employed the corpus (Brysbaert et al., 2014) on concreteness ratings for our English MRCPD+ validation because of lacking human assessed imageability ratings and high correlation between concreteness ratin...


Automatically Generated Affective Norms of Abstractness, Arousal, Imageability and Valence for 350 000 German Lemmas

This paper presents a collection of 350 000 German lemmatised words, rated on four psycholinguistic affective attributes. All ratings were obtained via a supervised learning algorithm that can automatically calculate a numerical rating of a word. We applied this algorithm to abstractness, arousal, imageability and valence. Comparison with human ratings reveals high correlation across a...


Nencki Affective Word List (NAWL): the cultural adaptation of the Berlin Affective Word List–Reloaded (BAWL-R) for Polish

In the present article, we introduce the Nencki Affective Word List (NAWL), created in order to provide researchers with a database of 2,902 Polish words, including nouns, verbs, and adjectives, with ratings of emotional valence, arousal, and imageability. Measures of several objective psycholinguistic features of the words (frequency, grammatical class, and number of letters) are also controll...




Publication date: 2014